ABSTRACT



A method constructs a super-resolution texture from a sequence of images of a non-rigid three-dimensional object. A shape of the object is represented as a matrix of 3D points, and a basis of possible deformations of the object is represented as a matrix of displacements of the 3D points; the matrices of 3D points and displacements form a model of the object in the video. A set of correspondences between the points in the model and the object in the images is formed. The points in each image are connected using the set of correspondences to form a triangle texture mesh for each image. Each triangle mesh is warped to a common coordinate system while super-sampling the texture in each image. The warped and super-sampled triangle meshes are averaged to produce the super-sampled texture of the object in the image.
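
By way of illustration only, the warping, super-sampling, and averaging steps can be sketched in Python as follows. The sketch rests on assumptions that are not part of the disclosure: images are grey-level lists of rows, a single triangle is processed, the warp is affine within the triangle (barycentric coordinates), bilinear interpolation stands in for the Gaussian sampling kernel of the preferred embodiment, and the function names barycentric, bilinear, warp_triangle, and average_textures are hypothetical.

```python
def barycentric(p, tri):
    """Barycentric coordinates of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    a = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    b = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return a, b, 1.0 - a - b


def bilinear(img, x, y):
    """Bilinearly interpolate a grey-level image (list of rows) at (x, y)."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)  # clamp to the image bounds
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bot * fy


def warp_triangle(img, tri_img, tri_tex, size):
    """Warp one image triangle into a size x size texture.

    tri_img holds the vertex locations in the image (from the
    correspondences); tri_tex holds the same vertices in the common
    coordinate system. Texels outside the triangle are left as None.
    """
    out = [[None] * size for _ in range(size)]
    for v in range(size):
        for u in range(size):
            a, b, c = barycentric((u, v), tri_tex)
            if min(a, b, c) < -1e-9:
                continue  # texel lies outside the canonical triangle
            x = a * tri_img[0][0] + b * tri_img[1][0] + c * tri_img[2][0]
            y = a * tri_img[0][1] + b * tri_img[1][1] + c * tri_img[2][1]
            out[v][u] = bilinear(img, x, y)
    return out


def average_textures(textures):
    """Average warped textures texel by texel, skipping uncovered texels."""
    size = len(textures[0])
    avg = [[None] * size for _ in range(size)]
    for v in range(size):
        for u in range(size):
            vals = [t[v][u] for t in textures if t[v][u] is not None]
            if vals:
                avg[v][u] = sum(vals) / len(vals)
    return avg
```

In this sketch the super-sampling comes from making the canonical triangle larger than the image triangle, so that each source pixel maps to several texels.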

7 Claims, 3 Drawing Sheets 
METHOD FOR EXTRACTING STATIC AND DYNAMIC SUPER-RESOLUTION TEXTURES FROM A SEQUENCE OF IMAGES

RELATED APPLICATION

This application is related to U.S. patent application Ser. No. 09/791,117 "Modeling Shape, Motion, and Flexion of Non-Rigid 3D Objects Directly from a Sequence of Images," filed concurrently by Brand on Feb. 22, 2001.

FIELD OF THE INVENTION

The present invention relates generally to computer graphics, and more particularly to a method for extracting textures from a sequence of images.

BACKGROUND OF THE INVENTION

Texture-mapping is a well known method for adding detail to renderings of computer graphic scenes. During texture-mapping, textures are applied to a graphics model. The model typically includes a set of 3D points and a specification of the edges, surfaces, or volumes that connect the points. If the rendering is volumetric, e.g., the rendering represents a solid object and not just its surface, then the textures can be in 3D. Texture-mapping gives the illusion of greater apparent detail than is present in the model's geometry. If the textures are extracted from photographs or images, the rendered images can appear quite realistic. Textures can be in the form of texture-maps, i.e., data, or texture-functions, i.e., procedures.

In general, textures and images are sparse samplings of a light field, typically, the reflected light from a surface. Digital images, including individual video frames, are fairly low-resolution samplings of reflectance. However, if there is motion in a sequence of images, then every image gives a slightly different sampling of the light field, and this information can be integrated over time to give a much denser sampling, also known as a super-resolution texture.

It is desired to provide super-resolution textures from a low-resolution sequence of images.

SUMMARY OF THE INVENTION

The method according to the invention extracts texture-maps and texture-functions, generally "textures," from a sequence of images and image-to-image correspondences. The extracted textures have a higher resolution and finer detail than the sequence of images from which they are extracted. If every image is annotated with control parameters, then the invention produces a procedure that generates super-resolution textures as a function of these parameters. This enables dynamic textures for surfaces or volumes that undergo appearance change, for example, for skin that smooths and pales when stretched and wrinkles and flushes when relaxed. These control parameters can be determined from the correspondences themselves.

In the following text the term "image" will be used for 2D or 3D arrays of data and "video" will be used for time-series of such arrays.

More specifically, the invention provides a method that constructs a super-resolution texture from a sequence of images of a non-rigid three-dimensional object. The object need not be non-rigid. A shape of the object is represented as a matrix of 3D points. A basis set of possible deformations of the object is represented as a matrix of displacements of the points. The matrices of 3D points and displacements form a model of the object in the video and its possible deformations. The points in the model are connected to form a triangle or tetrahedron mesh, depending on whether the images are 2D or 3D. For each image, a set of correspondences between the points in the model and the object in the image is formed. The correspondences are used to map the mesh into each image as a texture mesh. Each mesh, and the image texture the mesh covers, is warped to a common coordinate system, resulting in a texture that appears to be a deformation of the original image. The warp is done with super-sampling so that the resulting texture has many more pixels than the original image. The warped and super-sampled textures are averaged to produce a static super-sampled texture of the object in the image. Various weighted averages of the warped and super-sampled textures produce dynamic textures that can vary according to the deformation and/or pose of the object.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of an object modeled by a matrix of 3D vertices that can be displaced for changes in shape and pose;

FIG. 2 is a diagram of the equation which models the flexion and posing of the object; and

FIG. 3 is a flow diagram of a texture-mapping method according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

As shown in FIG. 1, the invention expresses the shape of a model 100 of a non-rigid (flexible) object 101 in a sequence of images 103 by a matrix of points 102. The example object modeled is a face. It should be understood that the invention can be used with volumetric (3D) models in a time series of 3D volume data sets, each data set, in effect, being an image of the volume at an instance in time.

A set of correspondences 110 gives the locations of the points 102 of the model of the object 101 in every image 104 of the sequence of images 103. Each location can be expressed with 2D or 3D matrix coordinates. A number of computer vision techniques are known for determining the set of correspondences 110. In the preferred embodiment of the present invention, the set of correspondences 110 is obtained as described in U.S. patent application Ser. No. 09/791,117 "Modeling Shape, Motion, and Flexion of Non-Rigid 3D Objects Directly from a Sequence of Images," filed concurrently herewith by Brand on Feb. 22, 2001, and incorporated herein in its entirety.

FIG. 3 shows a method for constructing static (top) and dynamic (bottom) super-resolution textures according to the invention.

Static Super-Resolution Textures

As shown in the top portion of FIG. 3, to construct static super-resolution textures 341 for the model 100, using the image-to-image correspondences 110, first connect 310 the points 102 of the model 100 to generate triangular texture surface meshes 105. Connecting the points turns them into vertices. For clarity, the triangle texture meshes 105 are only partially shown. In practice, the meshes 105 cover all parts of the model to be texture-mapped. The points 102 can be triangulated manually, or automatically via the well-known Delaunay process.

The set of correspondences 110 gives the location of each triangle vertex in every image 104. To construct the super-resolution textures for each of the triangles, the texture contained in every image 104 of the sequence of images 103,
within each corresponding triangle, is warped 320 to a common coordinate system 321.

In the course of warping 320, the texture from each image 104 is super-sampled to the desired texture resolution. The warped, super-sampled triangle-textures 331 for all images are then averaged 340 to produce the super-resolution textures 341. In the preferred embodiment, bilinear warps and Gaussian sampling kernels are used, but other sampling and warping techniques can also be used.

Note that this triangle-by-triangle process can be done on all triangles at once by warping the texture in each image 104 to a canonical positioning of the points 102, and then averaging 340. This canonical positioning is essentially the texture-coordinates of the vertices in the resulting texture-map.

Dynamic Textures

Many objects change surface appearance as they change shape (articulate). For example, the skin around the eyes and mouth wrinkles as the muscles underneath contract. These surface changes are usually too fine to be captured in the geometry of the model 100. However, they can be simulated by a dynamic texture-map that varies as a function of how the object articulates.

To do so, construct the warped super-sampled textures 331 of each image, as described for the top portion of FIG. 3. Then, as shown in the bottom portion, each such texture 331 is vectorized 350 and made into one column 351 of a texture-matrix (T-M) 352.

Control Parameters

Vectors of control parameters 359 are provided for each image 104. The control parameters model the changes of shape in the model 100, for example, the degree of muscle contraction, or other articulations of the model. The control parameters can be obtained from computer vision analysis of the sequence of images 103, or directly from the set of correspondences 110, for example as a low-dimensional representation of the correspondences such as can be obtained from a truncated principal components analysis (PCA). The control parameters can also be in the form of flexion vectors extracted from the sequence of images as described below.

The control parameters 359 for each image are arranged 360 similarly as columns 361 of a control matrix (C-M) 362. The control matrix 362 is then divided 370 into the texture matrix 352 to obtain a matrix 371 storing a basis set of textures. An inner product 380 of this matrix and any new control vector-parameter 379 yields a dynamic texture-vector 381 appropriate to that control setting. The texture-vector 381 is stored 390 into a texture-map buffer 391 and can be used to texture the appropriately deformed 3D model.

Note that this is a linear regression model that predicts texture from control parameters. However, any non-linear regression model that can be fitted to the texture and control pairs can be used instead.

Novel Video Formation

The above described method can also be used to render 395 a novel video 396 by using new textures 397. For example, an input sequence of images 103 of a frontal view of a talking head can be rendered as the novel video 396 where the head is viewed sideways, or at an oblique angle.

Although the preferred method uses computer vision techniques, it should not be construed as the only way such information can be obtained from the sequence of images.

As shown in FIG. 2, the shape and motion, i.e., the projection or pose P 200, of a non-rigid 3D model 100 onto the image plane can be expressed by

$$P = R_{2\times 3}\,\bigl(B_{3\times n} + (C_{1\times k} \otimes I_{3})\,D_{3k\times n}\bigr) \oplus T_{2\times 1} \qquad (1)$$

where R 201 is an orthographic projective rotation matrix, B 202 is the matrix of the 3D points 102, C 203 is a vector of flexions, i.e., deformation coefficients or control parameters 359, I is the identity matrix, D 205 is a matrix of k linearly separable deformations of the model, and T 206 is a 2D translation matrix. The deformations 203-205 are weighted by the flexion coefficients 207. The rotation matrix R drops the depth dimension, making the projection orthographic. If the basis set of deformations in D includes the base shape B, then the orthographic projection is scaled. Methods for determining these variables directly from the sequence of images are described in the related patent application by Brand as cited above.

The parameters R, C, and T take on different values in every image. The parameters B and D are intrinsic to the physical object depicted by the sequence of images. Static texture recovery requires correspondences (P) to be determined for many images; dynamic texture also requires control parameters (C). The texture can be combined with the model (B, D) and a novel sequence of control (C) and motion (R, T) parameters to generate novel animations of the object with a high degree of realism. That is, the novel animations are different from those appearing in the input sequence of images. In addition, the super-resolution textures can be combined with the original control and motion parameters to resynthesize the original sequence of images at higher resolutions. Note that super-resolution resynthesis of the original sequence of images can be done directly from the textures and correspondences without reference to the above model or 3D information.

This invention is described using specific terms and examples. It is to be understood that various other adaptations and modifications may be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.

I claim:

1. A method for constructing a super-resolution texture from a sequence of images of a non-rigid three-dimensional object, comprising:
representing a shape of the object as a matrix of 3D points, and a basis of possible deformations of the object as a matrix of displacements of the 3D points, the matrices of 3D points and displacements forming a model of the object in the sequence of images;
determining a set of correspondences between the points in the model and the object in the images;
connecting the points in the model to form a mesh;
warping, for each image, the mesh to a common coordinate system while super-sampling texture in each image covered by the mesh; and
averaging the super-sampled textures covered by the warped meshes to produce the super-resolution texture of the object in the image.

2. The method of claim 1 wherein the set of correspondences is obtained directly from the sequence of images.

3. The method of claim 1 wherein the points are connected by a Delaunay process.

4. The method of claim 1 wherein the warping is bilinear, and the super-sampling uses a Gaussian sampling kernel.

5. The method of claim 1 further comprising:
vectorizing each super-sampled texture as a column in a texture matrix;
providing, for each image, a vector of control parameters for the mesh as a column in a control matrix;
dividing the control matrix into the texture matrix to obtain a basis matrix;
forming an inner product of the basis matrix and a vector of new control parameters to obtain a new texture-vector.
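
For illustration only, the "dividing" step of claim 5 can be read as a least-squares solve, consistent with the description's remark that the method is a linear regression from control parameters to texture: find the basis matrix W such that W times the control matrix best reproduces the texture matrix. The Python sketch below assumes that reading. The normal-equation solution, the Gauss-Jordan inverse, and the names matmul, transpose, inverse, texture_basis, and dynamic_texture are hypothetical and are not taken from the disclosure.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(m):
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("control matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def texture_basis(texture_matrix, control_matrix):
    """Least-squares W with W C ~ T, via W = T C^T (C C^T)^-1.

    texture_matrix T has one column per image (texels x images);
    control_matrix C has one column per image (controls x images).
    """
    ct = transpose(control_matrix)
    gram = matmul(control_matrix, ct)  # controls x controls
    return matmul(matmul(texture_matrix, ct), inverse(gram))


def dynamic_texture(basis, control):
    """Inner product of the basis matrix with a new control vector."""
    return [sum(w * c for w, c in zip(row, control)) for row in basis]
```

The normal equations are adequate when there are only a few control parameters; with many, or nearly dependent, control parameters a numerically safer solver would be preferable.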

6. The method of claim 5 further comprising: 
generating a new texture matrix from the new texture-vector; and
applying the new texture matrix to the model while deforming the model to obtain a new image of the object.
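
For illustration only, the deforming of the model in claim 6 can be sketched with the pose equation of the description, P = R(B + (C ⊗ I)D) ⊕ T, in which the Kronecker term reduces to a weighted sum of the k stacked 3 x n deformation blocks of D. The Python sketch below assumes that D is stored as k such blocks one above the other and that T is added to every projected point; the function name project and the list-of-rows layout are hypothetical.

```python
def project(R, B, C, D, T):
    """Deform the base shape and project it orthographically.

    R: 2 x 3 projection (rotation with the depth row dropped)
    B: 3 x n base shape, one column per point
    C: k flexion coefficients
    D: 3k x n deformation basis, k stacked 3 x n blocks
    T: 2-vector translation, added to every projected point
    """
    n = len(B[0])
    k = len(C)
    # (C kron I3) D collapses to sum_i C[i] * D_i over the stacked blocks.
    shape = [[B[r][j] + sum(C[i] * D[3 * i + r][j] for i in range(k))
              for j in range(n)]
             for r in range(3)]
    # Apply the 2 x 3 projection, then translate each column.
    return [[sum(R[r][a] * shape[a][j] for a in range(3)) + T[r]
             for j in range(n)]
            for r in range(2)]
```

With C set to all zeros the sketch returns the projected base shape, which is a quick check that the deformation term is wired in correctly.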

7. The method of claim 1 further comprising: 
resynthesizing the sequence of images using the super- 
resolution texture. 